AdaBoost and Support Vector Machines for Unbalanced Data Sets

Author

  • Chi Zhang
Abstract

Boosting is a method for improving the accuracy of a given learning algorithm by combining multiple weak learners into a strong learner. AdaBoost rests on the assumption that even though a weak learner cannot classify all of the data well, each one performs well on some subset of the data, with a certain bias, so that combining many weak learners is expected to yield higher overall accuracy. The Support Vector Machine (SVM) is a popular machine learning technique for solving classification and regression problems. In this project, the LIBSVM implementation of SVMs was used to solve classification problems. The AdaBoost.M1 algorithm used SVMs as component learners, and the resulting algorithm was shown to improve accuracy sharply on unbalanced datasets. In the best case, AdaBoost.M1 with SVM improved accuracy by 10%. However, AdaBoost did not always boost performance. In the worst case, on the vowel dataset, AdaBoost.M1 with SVM performed slightly worse than the grid search method. By exploring various aspects of the AdaBoost.M1 with SVM algorithm, I found that the gamma update settings had an important impact on accuracy: they affected the number of component learners as well as the generalization of each learner. Ideally, a proper number of weak learners would fit unbalanced training data very well.
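The AdaBoost.M1 loop described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the project used LIBSVM SVMs as component learners, whereas this sketch swaps in a weighted decision stump so that it runs with the Python standard library alone. The names `train_stump`, `adaboost_m1`, and `predict`, the toy data, and the handling of a zero-error learner are all assumptions made for illustration.

```python
import math

def train_stump(X, y, w):
    """Fit a weighted one-feature threshold rule.

    Hypothetical stand-in for the SVM component learner used in the project.
    """
    labels = sorted(set(y))
    best = None  # (weighted error, feature index, threshold, low label, high label)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for lo in labels:
                for hi in labels:
                    err = sum(wi for row, yi, wi in zip(X, y, w)
                              if (lo if row[j] <= thr else hi) != yi)
                    if best is None or err < best[0]:
                        best = (err, j, thr, lo, hi)
    _, j, thr, lo, hi = best
    return lambda row: lo if row[j] <= thr else hi

def adaboost_m1(X, y, rounds=10):
    """Return a list of (hypothesis, vote weight) pairs following AdaBoost.M1."""
    n = len(X)
    w = [1.0 / n] * n                      # start from uniform example weights
    ensemble = []
    for _ in range(rounds):
        h = train_stump(X, y, w)
        miss = [h(row) != yi for row, yi in zip(X, y)]
        eps = sum(wi for wi, m in zip(w, miss) if m)   # weighted training error
        if eps > 0.5:                      # AdaBoost.M1 stops if the learner is too weak
            break
        if eps == 0:                       # assumption: a perfect learner is used alone
            return [(h, 1.0)]
        beta = eps / (1.0 - eps)
        ensemble.append((h, math.log(1.0 / beta)))
        # Shrink the weights of correctly classified examples, then renormalize,
        # so the next learner concentrates on the examples that were missed.
        w = [wi * (1.0 if m else beta) for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote: each hypothesis votes log(1/beta) for its predicted label."""
    if not ensemble:
        raise ValueError("empty ensemble: no component learner was accepted")
    votes = {}
    for h, alpha in ensemble:
        label = h(row)
        votes[label] = votes.get(label, 0.0) + alpha
    return max(votes, key=votes.get)

# Toy unbalanced data (hypothetical): 2 positives among 10 points.
X = [[float(i)] for i in range(10)]
y = [1 if i in (4, 5) else 0 for i in range(10)]
ensemble = adaboost_m1(X, y, rounds=3)
print([predict(ensemble, row) for row in X])
```

On this toy set no single stump can do better than the majority-class rule (8 of 10 points correct), while the three-round ensemble labels all ten points correctly. The number of rounds matters here in the same way the abstract reports for the number of component learners: it is a setting to tune, not a value that can be raised freely.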

Similar resources

Support vector machines for candidate nodules classification

Image processing techniques have proved to be effective for the improvement of radiologists’ diagnosis of lung nodules. In this paper we present a computerized system aimed at lung nodules detection; it employs two different multi-scale schemes to identify the lung field and then extract a set of candidate regions with a high sensitivity ratio. The main focus of this work is the classification ...

Support Vector Machines For Synthetic Aperture Radar Automatic Target Recognition

Algorithms that produce classifiers with large margins, such as support vector machines (SVMs), AdaBoost, etc. are receiving more and more attention in the literature. This paper presents a real application of SVMs for synthetic aperture radar automatic target recognition (SAR/ATR) and compares the result with conventional classifiers. The SVMs are tested for classification both in closed and o...

Parallel Tuning of Support Vector Machine Learning Parameters for Large and Unbalanced Data Sets

We consider the problem of selecting and tuning learning parameters of support vector machines, especially for the classification of large and unbalanced data sets. We show why and how simple models with few parameters should be refined and propose an automated approach for tuning the increased number of parameters in the extended model. Based on a sensitive quality measure we analyze correlati...

Face Recognition using Eigenfaces, PCA and Support Vector Machines

This paper is based on a combination of principal component analysis (PCA), eigenfaces and support vector machines. Using the N-fold method, and depending on the value of N, each person's face images are divided into two sections. As a result, vectors of training features and test features are obtained. Classification precision and accuracy were examined with three different types of kernel and...

Microsoft Word - Finding More Non-supersingular Elliptic Curves for Pairing..

Ensemble learning algorithms such as AdaBoost and Bagging have been in active research and shown improvements in classification results for several benchmarking data sets with mainly decision trees as their base classifiers. In this paper we experiment to apply these Meta learning techniques with classifiers such as random forests, neural networks and support vector machines. The data sets are ...

Publication date: 2012